48 research outputs found

Generation of Application Specific Hardware Extensions for Hybrid Architectures: The Development of PIRANHA - A GCC Plugin for High-Level-Synthesis

Author: Hempel Gerald
Publication date: 11/11/2019

Architectures combining a field-programmable gate array (FPGA) and a general-purpose processor on a single chip have become increasingly popular in recent years. On the one hand, such hybrid architectures facilitate the use of application-specific hardware accelerators that improve the performance of the software on the host processor. On the other hand, they oblige system designers to handle the whole process of hardware/software co-design. The complexity of this process is still one of the main factors hindering the widespread use of hybrid architectures. Thus, an automated process that aids programmers with hardware/software partitioning and the generation of application-specific accelerators is an important issue. The method presented in this thesis requires neither restrictions on the high-level language used nor special source code annotations. Such requirements are usually an entry barrier for programmers without a deeper understanding of the underlying hardware platform. This thesis introduces a seamless programming flow that allows generating hardware accelerators for unrestricted, legacy C code. The implementation consists of a GCC plugin that automatically identifies application hot-spots and generates hardware accelerators accordingly. Apart from the accelerator implementation in a hardware description language, the compiler plugin also generates host processor interfaces and, if necessary, a prototypical integration with the host operating system. An evaluation with typical embedded applications shows general benefits of the approach, but also reveals limiting factors that hamper possible performance improvements.

Technische Universität Dresden: Qucosa

GCC-Plugin for Automated Accelerator Generation and Integration on Hybrid FPGA-SoCs

Author: Castrillon Jeronimo
Hempel Gerald
Hochberger Christian
Vogt Markus
Publication date: 01/09/2015

In recent years, architectures combining a reconfigurable fabric and a general-purpose processor on a single chip have become increasingly popular. Such hybrid architectures allow extending embedded software with application-specific hardware accelerators to improve performance and/or energy efficiency. Aiding system designers and programmers in handling the complexity of the required process of hardware/software (HW/SW) partitioning is an important issue. Current methods are often restricted to bare-metal systems or to subsets of mainstream programming languages, or they require special coding guidelines, e.g., via annotations. These restrictions still represent a high entry barrier for the wider community of programmers that new hybrid architectures are intended for. In this paper we revisit HW/SW partitioning and present a seamless programming flow for unrestricted, legacy C code. It consists of a retargetable GCC plugin that automatically identifies code sections for hardware acceleration and generates code accordingly. The proposed workflow was evaluated on the Xilinx Zynq platform using unmodified code from an embedded benchmark suite.

Comment: Presented at the Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320)

arXiv.org e-Print Archive

TUbiblio

Automatic Creation of High-Bandwidth Memory Architectures from Domain-Specific Languages: The Case of Computational Fluid Dynamics

Author: Christian Pilato
Gerald Hempel
Jeronimo Castrillon
Karl F. A. Friebel
Mattia Tibaldi
Stephanie Soldavini
Publication date: 27/07/2022

Numerical simulations can help solve complex problems. Most of these algorithms are massively parallel and thus good candidates for FPGA acceleration thanks to spatial parallelism. Modern FPGA devices can leverage high-bandwidth memory (HBM) technologies, but when applications are memory-bound, designers must craft advanced communication and memory architectures for efficient data movement and on-chip storage. This development process requires hardware design skills that are uncommon among domain-specific experts. In this paper, we propose an automated tool flow from a domain-specific language (DSL) for tensor expressions to generate massively parallel accelerators on HBM-equipped FPGAs. Designers can use this flow to integrate and evaluate various compiler or hardware optimizations. We use computational fluid dynamics (CFD) as a paradigmatic example. Our flow starts from the high-level specification of tensor operations and combines an MLIR-based compiler with an in-house hardware generation flow to generate systems with parallel accelerators and a specialized memory architecture that moves data efficiently, aiming at fully exploiting the available CPU-FPGA bandwidth. We simulated applications with millions of elements, achieving up to 103 GFLOPS with one compute unit and custom precision when targeting a Xilinx Alveo U280. Our FPGA implementation is up to 25x more energy efficient than expert-crafted Intel CPU implementations.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Developing a Nationwide Infrastructure for Therapeutic Drug Monitoring of Targeted Oral Anticancer Drugs: The ON-TARGET Study Protocol

Author: Fuhr Uwe
Fuxius Stefan
Groenland Stefanie L.
Hempel Georg
Holdenrieder Stefan
Huitema Alwin D. R.
Illerhaus Gerald
Jaehde Ulrich
Joerger Markus
Kloft Charlotte
Mayer Frank
Mc Laughlin Anna M.
Müller Lothar
Opitz Patrick
Scherf-Clavel Oliver
Schmulenson Eduard
Steeghs Neeltje
Teplytska Olga
Zimmermann Sebastian
Publication date: 01/01/2021

Exposure-efficacy and/or exposure-toxicity relationships have been identified for up to 80% of oral anticancer drugs (OADs). Usually, OADs are administered at fixed doses despite their high interindividual pharmacokinetic variability, which results in large differences in drug exposure. Consequently, a substantial proportion of patients receive a suboptimal dose. Therapeutic Drug Monitoring (TDM), i.e., dosing based on measured drug concentrations, may be used to improve treatment outcomes. The prospective, multicenter, non-interventional ON-TARGET study (DRKS00025325) aims to investigate the potential of routine TDM to reduce adverse drug reactions in renal cell carcinoma patients receiving axitinib or cabozantinib. Furthermore, the feasibility of using volumetric absorptive microsampling (VAMS), a minimally invasive and easy-to-handle blood sampling technique, for sample collection is examined. During routine visits, blood samples are collected and sent to bioanalytical laboratories. Venous and VAMS blood samples are collected in the first study phase to facilitate home-based capillary blood sampling in the second study phase. Within one week, the drug plasma concentrations are measured, interpreted, and reported back to the physician. Patients report their drug intake and toxicity using PRO-CTCAE-based questionnaires in dedicated diaries. Ultimately, the ON-TARGET study aims to develop a nationwide infrastructure for TDM for oral anticancer drugs.

Institutional Repository of the Freie Universität Berlin

Generation of Application Specific Hardware Extensions for Hybrid Architectures: The Development of PIRANHA - A GCC Plugin for High-Level-Synthesis

Author: Hempel Gerald
Publication date: 11/11/2019

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Generation of Application Specific Hardware Extensions for Hybrid Architectures: The Development of PIRANHA - A GCC Plugin for High-Level-Synthesis

Author: Hempel Gerald
Publication date: 11/11/2019

HSSS - Hochschulschriftenserver der SLUB

From Domain-Specific Languages to Memory-Optimized Accelerators for Fluid Dynamics

Author: Christian Pilato
Gerald Hempel
Jeronimo Castrillon
Karl F. A. Friebel
Stephanie Soldavini
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2021

Many applications increasingly require numerical simulations for solving complex problems. Most of these numerical algorithms are massively parallel and often implemented on parallel high-performance computers. However, classic CPU-based platforms suffer due to the demand for higher resolutions and the exponential growth of data. FPGAs offer a powerful and flexible alternative that can host accelerators to complement such platforms. Developing such application-specific accelerators is still challenging because it is hard to provide efficient code for hardware synthesis. In this paper, we study the challenges of porting a numerical simulation kernel onto an FPGA. We propose an automated tool flow from a domain-specific language (DSL) to generate accelerators for computational fluid dynamics on FPGAs. Our DSL-based flow simplifies the exploration of parameters and constraints such as on-chip memory usage. We also propose a decoupled optimization of memory and logic resources, which allows us to better use the limited FPGA resources. In our preliminary evaluation, this enabled doubling the number of parallel kernels, increasing the accelerator speedup versus ARM execution from 7 to 12 times.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Towards GCC-based automatic soft-core customization

Author: Hempel Gerald
Hochberger Christian
Raitza Michael
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

TUbiblio